{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from __future__ import print_function\n",
    "import numpy as np\n",
    "import mdtraj as md"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To benchmark the speed of the RMSD calculation, it's not really\n",
    "necessary to use 'real' coordinates, so let's just generate\n",
    "some random numbers from a normal distribution for the cartesian\n",
    "coordinates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "t = md.Trajectory(xyz=np.random.randn(1000, 100, 3), topology=None)\n",
    "print(t)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The Theobald QCP method requires centering the invidual conformations\n",
    "the origin. That's done on the fly when we call `md.rmsd()`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import time\n",
    "start = time.time()\n",
    "\n",
    "for i in range(100):\n",
    "    md.rmsd(t, t, 0)\n",
    "\n",
    "print('md.rmsd(): %.2f rmsds / s' % ((t.n_frames * 100) / (time.time() - start)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "But for some applications like clustering, we want to run many\n",
    "rmsd() calculations per trajectory frame. Under these circumstances,\n",
    "the centering of the trajectories is going to be done many times, which\n",
    "leads to a slight slowdown. If we manually center the trajectory\n",
    "and then inform the rmsd() function that the centering has been\n",
    "precentered, we can achieve ~2x speedup, depending on your machine\n",
    "and the number of atoms."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "t.center_coordinates()\n",
    "start = time.time()\n",
    "\n",
    "for i in range(100):\n",
    "    md.rmsd(t, t, 0, precentered=True)\n",
    "\n",
    "print('md.rmsd(precentered=True): %.2f rmsds / s' % ((t.n_frames * 100) / (time.time() - start)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Just for fun, let's compare this code to the straightforward\n",
    "numpy implementation of the same algorithm, which mdtraj has\n",
    "(mostly for testing) in the  mdtraj.geometry.alignment subpackage"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "from mdtraj.geometry.alignment import rmsd_qcp\n",
    "\n",
    "start = time.time()\n",
    "for k in range(t.n_frames):\n",
    "    rmsd_qcp(t.xyz[0], t.xyz[k])\n",
    "\n",
    "print('pure numpy rmsd_qcp(): %.2f rmsds / s' % (t.n_frames / (time.time() - start)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`md.rmsd()` code is *a lot* faster. If you go look at the `rmsd_qcp` source code, you'll see that it's not because that code is particularly slow or unoptimized. It's about as good as you can do with numpy. The reason for the speed difference is that an inordinate amount of time was put into hand-optimizing an SSE3 implementation in C for the `md.rmsd()` code."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
